Transcription and annotation of a Japanese accented spoken corpus of L2 Spanish for the development of CAPT applications
Abstract
This paper addresses the process of transcribing and annotating spontaneous non-native speech with the aim of compiling a training corpus for the development of Computer-Assisted Pronunciation Training (CAPT) applications enhanced with Automatic Speech Recognition (ASR) technology. To better adapt ASR technology to CAPT tools, the recognition systems must be trained with non-native corpora transcribed and annotated at several linguistic levels. This allows the automatic generation of pronunciation variants, new L2 phoneme units, and statistical data about the most frequent mispronunciations by L2 learners. We present a longitudinal non-native spoken corpus of L2 Spanish by Japanese speakers, specifically designed for the development of CAPT tools, fully transcribed at both phonological and phonetic levels and annotated at the error level. We report the influence of oral proficiency, speaking style, and L2 exposure on pronunciation accuracy, as obtained from the statistical analysis of the corpus.
Related resources
Intermediate phonetic realizations in a Japanese accented L2 Spanish corpus
This paper addresses the issue of manual transcription of non-native speech in an attempt to establish rule-based strategies for labelling intermediate realizations. The problems of transcribing non-canonical realizations of L2 sounds which present shared features of the target (Spanish) and the source language (Japanese) will be considered. We introduce a Japanese-accented non-native L2 Spanis...
iCALL corpus: Mandarin Chinese spoken by non-native speakers of European descent
We present iCALL, a speech corpus designed to evaluate Mandarin Chinese pronunciation patterns of non-native speakers of European descent, developed at the Institute for Infocomm Research (I²R) in Singapore. To the best of our knowledge, iCALL is larger than any reported non-native corpora to date in terms of utterance number, duration, and number of speakers: iCALL consists of 90,841 utterances...
Phonological Awareness Impact on Articulatory Accuracy of the Spanish Liquid [r] in Japanese FL Learners of Spanish
Foreign language learners tend to avoid phonological difficulties and simply transfer sounds whether from their L1 or any pre-existing L2. Phonological awareness (PA) gives students an active role in understanding their own potential in improving pronunciation through several methods. However, such methods are likely to be restricted to only passive learning methods, such as repetition, reading...
Cultural Influence on the Expression of Cathartic Conceptualization in English and Spanish: A Corpus-Based Analysis
This paper investigates the conceptualization of emotional release from a cognitive linguistics perspective (Cognitive Metaphor Theory). The metaphor WEEPING IS A MEANS OF LIBERATING CONTAINED EMOTIONS is grounded in universal embodied cognition and is reflected in linguistic expressions in English and Spanish. Lexicalization patterns which encapsulate this conceptualization i...